Less is More: Active Learning with Support Vector Machines

نویسندگان

  • Greg Schohn
  • David Cohn
چکیده

We describe a simple active learning heuristic which greatly enhances the generalization behavior of support vector machines (SVMs) on several practical document classification tasks. We observe a number of benefits, the most surprising of which is that a SVM trained on a wellchosen subset of the available corpus frequently performs better than one trained on all available data. The heuristic for choosing this subset is simple to compute, and makes no use of information about the test set. Given that the training time of SVMs depends heavily on the training set size, our heuristic not only offers better performance with fewer data, it frequently does so in less time than the naive approach of training on all available data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Study of Extreme Learning Machines and Support Vector Machines in Prediction of Sediment Transport in Open Channels

The limiting velocity in open channels to prevent long-term sedimentation is predicted in this paper using a powerful soft computing technique known as Extreme Learning Machines (ELM). The ELM is a single Layer Feed-forward Neural Network (SLFNN) with a high level of training speed. The dimensionless parameter of limiting velocity which is known as the densimetric Froude number (Fr) is predicte...

متن کامل

Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM

Structural repetitive subsequences are most important portion of biological sequences, which play crucial roles on corresponding sequence’s fold and functionality. Biggest class of the repetitive subsequences is “Transposable Elements” which has its own sub-classes upon contexts’ structures. Many researches have been performed to criticality determine the structure and function of repetitiv...

متن کامل

Separating Well Log Data to Train Support Vector Machines for Lithology Prediction in a Heterogeneous Carbonate Reservoir

The prediction of lithology is necessary in all areas of petroleum engineering. This means that to design a project in any branch of petroleum engineering, the lithology must be well known. Support vector machines (SVM’s) use an analytical approach to classification based on statistical learning theory, the principles of structural risk minimization, and empirical risk minimization. In this res...

متن کامل

Fault diagnosis in a distillation column using a support vector machine based classifier

Fault diagnosis has always been an essential aspect of control system design. This is necessary due to the growing demand for increased performance and safety of industrial systems is discussed. Support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...

متن کامل

Application of Machine Learning Approaches in Rainfall-Runoff Modeling (Case Study: Zayandeh_Rood Basin in Iran)

Run off resulted from rainfall is the main way of receiving water in most parts of the World. Therefore, prediction of runoff volume resulted from rainfall is getting more and more important in control, harvesting and management of surface water. In this research a number of machine learning and data mining methods including support vector machines, regression trees (CART algorithm), model tree...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000